Skip to content

[monitoring] fix: use retrying module in the fixture class#3285

Merged
tmatsuo merged 7 commits into
GoogleCloudPlatform:masterfrom
tmatsuo:fix-monitoring
Apr 7, 2020
Merged

[monitoring] fix: use retrying module in the fixture class#3285
tmatsuo merged 7 commits into
GoogleCloudPlatform:masterfrom
tmatsuo:fix-monitoring

Conversation

@tmatsuo

@tmatsuo tmatsuo commented Apr 6, 2020

Copy link
Copy Markdown
Contributor

fixes #2971
fixes #2972
fixes #2973
fixes #3085

It will likely fix those issues, not guaranteed, but it's worth a try.

fixes GoogleCloudPlatform#2971
fixes GoogleCloudPlatform#2972
fixes GoogleCloudPlatform#2973
fixes GoogleCloudPlatform#3085

It will likely fix those issues, not guaranteed, but it's worth a try.
@tmatsuo tmatsuo requested review from kurtisvg and leahecole April 6, 2020 22:42
@tmatsuo tmatsuo requested a review from a team as a code owner April 6, 2020 22:42
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Apr 6, 2020
@@ -1 +1,3 @@
pytest==5.3.2
gcp-devrel-py-tools==0.0.15
google-cloud-core

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to pin a version?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a temporary workaround. It'll go away.
(Actually, I'll use flaky, so those 2 lines will go away)

import random
import string

from gcp_devrel.testing import eventually_consistent

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I'd prefer we stop relying on this module altogether. It's not maintained and a lot of the logic isn't really great for testing.

IIRC, eventually_consistent just attempts the function until it passes. This could make the issue of failures during concurrent tests worse. There are other libraries (like flaky) that do this better and don't add extra maintenance on us.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kurtisvg
Flaky sounds like a plan. I'll use flaky for this test.

But, I kinda disagree with your two points; for the first point "It's not maintained", we can maintain it, right? I'm happy to do that from now on. For the second point, "lot of the logic isn't really great for testing." Can you elaborate? Maybe you can file a bug on the repo?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think eventually_consistent is good for eventually consistent tests. It is basically a wrapper around retrying module with default exponential backoff parameters.

But I agree that flaky is more suitable with this test.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RE: maintenance - it doesn't seem worth the effort of us maintaining and releasing a special module just for testing our samples. Are our libraries really so hard to test that we can't find an out of the box solution to reach reasonable tests? (I think the answer is no)

RE: logic - IMO, flakey tests are bad tests. Running a test several times hoping at least one passes means that potentially, our users also have to run the sample multiple times to make sure that it's working. Ideally, the snippet should work the first time, every time.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kurtisvg Thanks for the discussion. Let's discuss further off PR :)

@tmatsuo

tmatsuo commented Apr 6, 2020

Copy link
Copy Markdown
Contributor Author

@kurtisvg Thanks, now it's using flaky. PTAL

@tmatsuo tmatsuo changed the title fix: mark tests with @eventually_consistent.mark fix: mark tests with @flaky Apr 6, 2020
@tmatsuo

tmatsuo commented Apr 7, 2020

Copy link
Copy Markdown
Contributor Author

Ah, as far as I read the logs correctly, flaky doesn't retry.

Oh I just found the pytest marker. Another commit is on the way.

@tmatsuo tmatsuo changed the title fix: mark tests with @flaky fix: mark tests with @pytest.mark.flaky Apr 7, 2020
@tmatsuo

tmatsuo commented Apr 7, 2020

Copy link
Copy Markdown
Contributor Author

Oh I think I started to understand the error better.
The log says:

ERROR at setup of test_list_alert_policies

This is failing in the fixture setup. I think I'm going to retry the setup with retrying module with an exponential backoffs.

@kurtisvg

kurtisvg commented Apr 7, 2020

Copy link
Copy Markdown
Contributor

@kurtisvg Thanks, now it's using flaky. PTAL

@tmatsuo You may also want to want to to try a retry filter with a pause. Something like this:

def retry_on_concurrent_edits(err, *args):
    time.sleep(1)
    return "Too many concurrent edits to the project configuration" in str(err)

@flaky(rerun_filter=delay_rerun)
def test():

I'm not sure if fixtures are recreated on retry, but that should at least give a pause in between different attempts (and only retry if it's the 409 error).

@tmatsuo tmatsuo changed the title fix: mark tests with @pytest.mark.flaky fix: use retrying module in the fixture class Apr 7, 2020
@tmatsuo

tmatsuo commented Apr 7, 2020

Copy link
Copy Markdown
Contributor Author

@kurtisvg Now I use retrying module in the fixture class.

I hope it will fix the concurrent tests. Let's see the kokoro result.

@tmatsuo

tmatsuo commented Apr 7, 2020

Copy link
Copy Markdown
Contributor Author

Ah, interesting. The latest retrying module doesn't support the tuple. I'll make a fix.

@tmatsuo

tmatsuo commented Apr 7, 2020

Copy link
Copy Markdown
Contributor Author

Alright, the tests are now passing. Previously, one of the tests was constantly failing, so it's certainly an improvement. I'll re-run the tests few times to see how stable they are.

Note: this is not an ultimate fix because when we have more and more builds, it will cause problems. Also, the presubmit builds might conflict with continuous builds, etc etc.

I'll file another issue for the fundamental fix.

@kurtisvg PTAL

@tmatsuo tmatsuo added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@tmatsuo tmatsuo added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@tmatsuo tmatsuo added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@tmatsuo

tmatsuo commented Apr 7, 2020

Copy link
Copy Markdown
Contributor Author

Tests were all green 3 times in a row, but now we have another faiulre.

It seems like we need to retry in the test itself, and the teardown. Another commit is on the way.

@tmatsuo tmatsuo added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@tmatsuo tmatsuo added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@tmatsuo tmatsuo added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@tmatsuo tmatsuo added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2020
@tmatsuo

tmatsuo commented Apr 7, 2020

Copy link
Copy Markdown
Contributor Author

Tests are all green 5 times in a row.
@leahecole PTAL

@tmatsuo tmatsuo changed the title fix: use retrying module in the fixture class [monitoring] fix: use retrying module in the fixture class Apr 7, 2020
@tmatsuo tmatsuo mentioned this pull request Apr 7, 2020

@leahecole leahecole left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh awesome love that we're moving to retry instead

@tmatsuo tmatsuo merged commit 5c134ec into GoogleCloudPlatform:master Apr 7, 2020
@tmatsuo tmatsuo deleted the fix-monitoring branch April 7, 2020 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes This human has signed the Contributor License Agreement.

Projects

None yet

6 participants